AITopics | perceptual distance

perceptual distance

DeepNose: An Equivariant Convolutional Neural Network Predictive Of Human Olfactory Percepts

Shuvaev, Sergey, Tran, Khue, Samoilova, Khristina, Mascart, Cyrille, Koulakov, Alexei

arXiv.org Artificial Intelligence, Dec-11-2024

The olfactory system employs responses of an ensemble of odorant receptors (ORs) to sense molecules and to generate olfactory percepts. Here we hypothesized that ORs can be viewed as 3D spatial filters that extract molecular features relevant to the olfactory system, similarly to the spatio-temporal filters found in other sensory modalities. To build these filters, we trained a convolutional neural network (CNN) to predict human olfactory percepts obtained from several semantic datasets. Our neural network, DeepNose, produced responses that are approximately invariant to the molecules' orientation, due to its equivariant architecture. Our network offers high-fidelity perceptual predictions for different olfactory datasets. In addition, our approach allows us to identify molecular features that contribute to specific perceptual descriptors. Because the DeepNose network is designed to be aligned with the biological system, our approach predicts distinct perceptual qualities for different stereoisomers. Because the DeepNose architecture processes several molecules at the same time, it permits inferring the perceptual quality of odor mixtures. We propose that the DeepNose network can use 3D molecular shapes to generate high-quality predictions for human olfactory percepts and help identify molecular features responsible for odor quality.
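
The orientation invariance mentioned in this abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a molecule is a plain list of 3D atom coordinates, the "filter" is a single linear response, and invariance is obtained by max-pooling that response over the four 90-degree rotations about the z axis. The names `rotate_z90`, `filter_response`, and `pooled_response` are hypothetical, and the weights are arbitrary.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def rotate_z90(points: Sequence[Point], k: int) -> List[Point]:
    """Rotate every point by k * 90 degrees about the z axis."""
    out = list(points)
    for _ in range(k % 4):
        out = [(-y, x, z) for (x, y, z) in out]
    return out

def filter_response(points: Sequence[Point],
                    weights: Point = (1.0, 2.0, 0.5)) -> float:
    """Toy linear filter (assumption): depends on the molecule's orientation."""
    wx, wy, wz = weights
    return sum(wx * x + wy * y + wz * z for (x, y, z) in points)

def pooled_response(points: Sequence[Point],
                    weights: Point = (1.0, 2.0, 0.5)) -> float:
    """Max-pool the filter over the four z-rotations.

    Rotating the input only permutes the four rotated copies, so the
    maximum is unchanged: the pooled response is invariant to this group.
    """
    return max(filter_response(rotate_z90(points, k), weights) for k in range(4))
```

In this toy setting, `filter_response` changes when the molecule is rotated, while `pooled_response` does not. The sketch covers only a four-element rotation group; the abstract describes invariance that is approximate and arises from the network's equivariant architecture.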

artificial intelligence, machine learning, molecule, (18 more...)

arXiv.org Artificial Intelligence

2412.08747

Country:

North America > United States (0.05)
Europe > United Kingdom > Wales (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.47)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation

Tang, Raphael, Zhang, Xinyu, Xu, Lixinyu, Lu, Yao, Li, Wenyan, Stenetorp, Pontus, Lin, Jimmy, Ture, Ferhan

arXiv.org Artificial Intelligence, Jun-12-2024

Diffusion models are the state of the art in text-to-image generation, but their perceptual variability remains understudied. In this paper, we examine how prompts affect image variability in black-box diffusion-based models. We propose W1KP, a human-calibrated measure of variability in a set of images, bootstrapped from existing image-pair perceptual distances. Current datasets do not cover recent diffusion models, so we curate three test sets for evaluation. Our best perceptual distance outperforms nine baselines by up to 18 points in accuracy, and our calibration matches graded human judgements 78% of the time. Using W1KP, we study prompt reusability and show that Imagen prompts can be reused for 10-50 random seeds before new images become too similar to already generated images, while Stable Diffusion XL and DALL-E 3 can be reused 50-200 times. Lastly, we analyze 56 linguistic features of real prompts, finding that the prompt's length, CLIP embedding norm, concreteness, and word senses influence variability most. As far as we are aware, we are the first to analyze diffusion variability from a visuolinguistic perspective. Our project page is at http://w1kp.com.
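
One way to picture a set-level variability score built from image-pair distances is the sketch below. It is not the W1KP definition: the abstract does not state how pairwise distances are aggregated or calibrated, so the mean over all unordered pairs is an assumption, and the L2 distance on feature vectors is only a stand-in for a learned perceptual distance. The function names are hypothetical.

```python
from itertools import combinations
from typing import Callable, Sequence

Vector = Sequence[float]

def l2_distance(a: Vector, b: Vector) -> float:
    """Stand-in pairwise distance (assumption); a perceptual metric would go here."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def set_variability(items: Sequence[Vector],
                    distance: Callable[[Vector, Vector], float] = l2_distance) -> float:
    """Mean distance over all unordered pairs; 0.0 for fewer than two items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)
```

A set of near-identical images would score close to zero under such a measure, which is the situation the abstract describes when a prompt has been reused for too many seeds.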

arXiv.org Artificial Intelligence

2406.08482

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry: Transportation (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.51)

Disentangling the Link Between Image Statistics and Human Perception

Hepburn, Alexander, Laparra, Valero, Santos-Rodriguez, Raúl, Malo, Jesús

arXiv.org Artificial Intelligence, Oct-5-2023

In the 1950s, Barlow and Attneave hypothesised a link between biological vision and information maximisation. Following Shannon, information was defined using the probability of natural images. A number of physiological and psychophysical phenomena have been derived ever since from principles like info-max, efficient coding, or optimal denoising. However, it remains unclear how this link is expressed in mathematical terms from image probability. First, classical derivations were subjected to strong assumptions on the probability models and on the behaviour of the sensors. Moreover, the direct evaluation of the hypothesis was limited by the inability of the classical image models to deliver accurate estimates of the probability. In this work we directly evaluate image probabilities using an advanced generative model for natural images, and we analyse how probability-related factors can be combined to predict human perception via sensitivity of state-of-the-art subjective image quality metrics. We use information theory and regression analysis to find a combination of just two probability-related factors that achieves 0.8 correlation with subjective metrics. This probability-based sensitivity is psychophysically validated by reproducing the basic trends of the Contrast Sensitivity Function, its suprathreshold variation, and trends of the Weber-law and masking.

correlation, perceptual distance, sensitivity, (16 more...)

arXiv.org Artificial Intelligence

2303.09874

Country:

South America > Uruguay > Artigas > Artigas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Explainers in the Wild: Making Surrogate Explainers Robust to Distortions through Perception

Hepburn, Alexander, Santos-Rodriguez, Raul

arXiv.org Machine Learning, Feb-22-2021

Explaining the decisions of models is becoming pervasive in the image processing domain, whether by using post-hoc methods or by creating inherently interpretable models. While the widespread use of surrogate explainers is a welcome addition to inspect and understand black-box models, assessing the robustness and reliability of the explanations is key for their success. Additionally, whilst existing work in the explainability field proposes various strategies to address this problem, the challenges of working with data in the wild are often overlooked. For instance, in image classification, distortions to images can affect not only the predictions assigned by the model, but also the explanation. Given a clean and a distorted version of an image, even if the prediction probabilities are similar, the explanation may still be different. In this paper we propose a methodology to evaluate the effect of distortions on explanations by embedding perceptual distances that tailor the neighbourhoods used to train surrogate explainers. We also show that by operating in this way, we can make the explanations more robust to distortions. We generate explanations for images in the Imagenet-C dataset and demonstrate how using a perceptual distance in the surrogate explainer creates more coherent explanations for the distorted and reference images.
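
The idea of letting a distance shape the neighbourhood of a surrogate explainer can be sketched as follows. This is a generic illustration, not the authors' method: it assumes an exponential kernel that turns each sample's distance from the explained image into a weight, followed by a weighted least-squares fit of a one-feature linear surrogate. The kernel form, the `width` parameter, and both function names are assumptions; the distances passed in would come from a perceptual metric.

```python
import math
from typing import List, Sequence, Tuple

def kernel_weights(distances: Sequence[float], width: float = 1.0) -> List[float]:
    """Map distances to sample weights: nearby samples count more."""
    return [math.exp(-(d ** 2) / (width ** 2)) for d in distances]

def weighted_linear_fit(xs: Sequence[float], ys: Sequence[float],
                        ws: Sequence[float]) -> Tuple[float, float]:
    """Weighted least squares for y ~ a * x + b; returns (a, b)."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    cov = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    if var == 0.0:
        raise ValueError("feature has no weighted variance; slope is undefined")
    slope = cov / var
    return slope, mean_y - slope * mean_x
```

Swapping the distance used to compute the weights changes which perturbed samples dominate the fit, and therefore which explanation the surrogate produces.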

distorted image, distortion, explanation, (16 more...)

arXiv.org Machine Learning

2102.10951

Country: Europe > United Kingdom > England > Bristol (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.35)

Perceptual Adversarial Robustness: Defense Against Unseen Threat Models

Laidlaw, Cassidy, Singla, Sahil, Feizi, Soheil

arXiv.org Machine Learning, Oct-12-2020

A key challenge in adversarial robustness is the lack of a precise mathematical characterization of human perception, used in the very definition of adversarial attacks that are imperceptible to human eyes. Most current attacks and defenses try to avoid this issue by considering restrictive adversarial threat models such as those bounded by $L_2$ or $L_\infty$ distance, spatial perturbations, etc. However, models that are robust against any of these restrictive threat models are still fragile against other threat models. To resolve this issue, we propose adversarial training against the set of all imperceptible adversarial examples, approximated using deep neural networks. We call this threat model the neural perceptual threat model (NPTM); it includes adversarial examples with a bounded neural perceptual distance (a neural network-based approximation of the true perceptual distance) to natural images. Through an extensive perceptual study, we show that the neural perceptual distance correlates well with human judgements of perceptibility of adversarial examples, validating our threat model. Under the NPTM, we develop novel perceptual adversarial attacks and defenses. Because the NPTM is very broad, we find that Perceptual Adversarial Training (PAT) against a perceptual attack gives robustness against many other types of adversarial attacks. We test PAT on CIFAR-10 and ImageNet-100 against five diverse adversarial attacks. We find that PAT achieves state-of-the-art robustness against the union of these five attacks, more than doubling the accuracy over the next best model, without training against any of them. That is, PAT generalizes well to unforeseen perturbation types. This is vital in sensitive applications where a particular threat model cannot be assumed, and to the best of our knowledge, PAT is the first adversarial defense with this property.
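
A bounded-distance threat model of the kind described in this abstract can be sketched as a membership test: a candidate counts as admissible if its distance to the natural image, measured in a feature space, stays within a budget. Everything concrete below is an assumption made for illustration. `toy_features` is a two-number stand-in for network activations, the Euclidean norm is one possible choice of distance, and `epsilon` is an arbitrary budget.

```python
import math
from typing import Callable, List, Sequence

def toy_features(x: Sequence[float]) -> List[float]:
    """Stand-in for network activations (assumption): mean and value range."""
    return [sum(x) / len(x), max(x) - min(x)]

def feature_distance(x: Sequence[float], y: Sequence[float],
                     features: Callable[[Sequence[float]], List[float]] = toy_features) -> float:
    """Euclidean distance between the feature vectors of two inputs."""
    fx, fy = features(x), features(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fx, fy)))

def within_threat_model(x: Sequence[float], x_adv: Sequence[float], epsilon: float,
                        features: Callable[[Sequence[float]], List[float]] = toy_features) -> bool:
    """True if x_adv lies within feature distance epsilon of x."""
    return feature_distance(x, x_adv, features) <= epsilon
```

The abstract contrasts this kind of constraint with bounds on $L_2$ or $L_\infty$ distance in pixel space; only the space in which the distance is measured differs.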

artificial intelligence, machine learning, threat model, (18 more...)

arXiv.org Machine Learning

2006.12655

Country:

North America > United States > Maryland (0.04)
Asia > Middle East > Jordan (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.96)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Towards Visual Distortion in Black-Box Attacks

Li, Nannan, Chen, Zhenzhong

arXiv.org Machine Learning, Jul-21-2020

Constructing adversarial examples in a black-box threat model degrades the original images by introducing visual distortion. In this paper, we propose a novel black-box attack approach that can directly minimize the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network. The quantified visual distortion, which measures the perceptual distance between the adversarial example and the original image, is introduced in our loss, whilst the gradient of the corresponding non-differentiable loss function is approximated by sampling noise from the learned noise distribution. We validate the effectiveness of our attack on ImageNet. Our attack results in much lower distortion when compared to the state-of-the-art black-box attacks and achieves a 100% success rate on ResNet50 and VGG16bn. The code is available at https://github.com/Alina-1997/visual-distortion-in-attack.
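
Approximating a gradient from loss values alone can be sketched with a standard sampling estimator. The version below is not the paper's algorithm: it assumes a fixed isotropic Gaussian noise distribution instead of a learned one, and uses antithetic pairs (each noise sample evaluated with both signs) to reduce variance. The function name and the default `sigma`, `n_samples`, and `seed` values are hypothetical.

```python
import random
from typing import Callable, List, Sequence

def estimate_gradient(loss: Callable[[List[float]], float],
                      x: Sequence[float],
                      sigma: float = 0.1,
                      n_samples: int = 1000,
                      seed: int = 0) -> List[float]:
    """Estimate the gradient of `loss` at x using only loss evaluations."""
    rng = random.Random(seed)  # seeded for reproducibility
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = loss([xi + sigma * e for xi, e in zip(x, eps)])
        minus = loss([xi - sigma * e for xi, e in zip(x, eps)])
        # Finite difference along the sampled direction, averaged over samples.
        scale = (plus - minus) / (2.0 * sigma * n_samples)
        for i in range(dim):
            grad[i] += scale * eps[i]
    return grad
```

The estimator only ever calls `loss`, which matches the loss-oracle access assumed in the abstract; each sample costs two queries.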

artificial intelligence, machine learning, visual distortion, (17 more...)

arXiv.org Machine Learning

2007.10593

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

PerceptNet: A Human Visual System Inspired Neural Network for Estimating Perceptual Distance

Hepburn, Alexander, Laparra, Valero, Malo, Jesús, McConville, Ryan, Santos-Rodriguez, Raul

arXiv.org Machine Learning, Oct-28-2019

Traditionally, the vision community has devised algorithms to estimate the distance between an original image and images that have been subject to perturbations. Inspiration was usually taken from the human visual perceptual system and how the system processes different perturbations in order to replicate to what extent it determines our ability to judge image quality. While recent works have presented deep neural networks trained to predict human perceptual quality, very few borrow any intuitions from the human visual system. To address this, we present PerceptNet, a convolutional neural network where the architecture has been chosen to reflect the structure and various stages in the human visual system. We evaluate PerceptNet on various traditional perception datasets and note strong performance on a number of them as compared with traditional image quality metrics. We also show that including a nonlinearity inspired by the human visual system in classical deep neural network architectures can increase their ability to judge perceptual similarity.

dataset, distortion, perceptnet, (12 more...)

arXiv.org Machine Learning

1910.12548

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Metric Learning for Phoneme Perception

Lakretz, Yair, Chechik, Gal, Cohen, Evan-Gary, Treves, Alessandro, Friedmann, Naama

arXiv.org Machine Learning, Sep-20-2018

Metric functions for phoneme perception capture the similarity structure among phonemes in a given language and therefore play a central role in phonology and psycho-linguistics. Various phenomena depend on phoneme similarity, such as spoken word recognition or serial recall from verbal working memory. This study presents a new framework for learning a metric function for perceptual distances among pairs of phonemes. Previous studies have proposed various metric functions, from simple measures counting the number of phonetic dimensions that two phonemes share (place-, manner-of-articulation and voicing), to more sophisticated ones such as deriving perceptual distances based on the number of natural classes that both phonemes belong to. However, previous studies have manually constructed the metric function, which may lead to an unsatisfactory account of the empirical data. This study presents a framework to derive the metric function from behavioral data on phoneme perception using learning algorithms. We first show that this approach outperforms previous metrics suggested in the literature in predicting perceptual distances among phoneme pairs. We then study several metric functions derived by the learning algorithms and show how perceptual saliencies of phonological features can be derived from them. For English, we show that the derived perceptual saliencies are in accordance with a previously described order among phonological features and show how the framework extends the results to more features. Finally, we explore how the metric function and perceptual saliencies of phonological features may vary across languages. To this end, we compare results based on two English datasets and a new dataset that we have collected for Hebrew.
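
A feature-based phoneme metric with per-feature weights can be sketched as below. The three features are the ones named in the abstract (place, manner of articulation, voicing), and the feature values for the four English consonants are standard. The weights are hypothetical numbers chosen only for illustration; they are neither learned here nor taken from the paper, where such weights would be fitted to behavioral data and read as perceptual saliencies.

```python
from typing import Dict

# Phonological feature values for four English consonants.
FEATURES: Dict[str, Dict[str, str]] = {
    "p": {"place": "bilabial", "manner": "stop", "voicing": "voiceless"},
    "b": {"place": "bilabial", "manner": "stop", "voicing": "voiced"},
    "t": {"place": "alveolar", "manner": "stop", "voicing": "voiceless"},
    "s": {"place": "alveolar", "manner": "fricative", "voicing": "voiceless"},
}

# Hypothetical weights (assumption): a larger weight means a more salient feature.
WEIGHTS: Dict[str, float] = {"place": 1.0, "manner": 2.0, "voicing": 0.5}

def phoneme_distance(a: str, b: str,
                     weights: Dict[str, float] = WEIGHTS,
                     features: Dict[str, Dict[str, str]] = FEATURES) -> float:
    """Sum the weights of the features on which the two phonemes differ."""
    return sum(w for name, w in weights.items()
               if features[a][name] != features[b][name])
```

With all weights set to 1 this reduces to counting mismatched phonetic dimensions, the simple hand-built measure the abstract mentions; learning the weights is what turns it into a data-derived metric.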

artificial intelligence, machine learning, phoneme, (18 more...)

arXiv.org Machine Learning

1809.07824

Country:

Europe (0.67)
North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

New Losses for Generative Adversarial Learning

Berger, Victor, Sebag, Michèle

arXiv.org Machine Learning, Jul-3-2018

Generative Adversarial Networks (Goodfellow et al., 2014), a major breakthrough in the field of generative modeling, learn a discriminator to estimate some distance between the target and the candidate distributions. This paper examines mathematical issues regarding the way the gradients for the generative model are computed in this context, and notably how to take into account how the discriminator itself depends on the generator parameters. A unifying methodology is presented to define mathematically sound training objectives for generative models taking this dependency into account in a robust way, covering GAN, VAE and some GAN variants as particular cases.

artificial intelligence, discriminator, machine learning, (17 more...)

arXiv.org Machine Learning

1807.0129

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

Luo, Bo (The Chinese University of Hong Kong) | Liu, Yannan (The Chinese University of Hong Kong) | Wei, Lingxiao (The Chinese University of Hong Kong) | Xu, Qiang (The Chinese University of Hong Kong)

AAAI Conferences, Feb-8-2018

Machine learning systems based on deep neural networks, being able to produce state-of-the-art results on various perception tasks, have gained mainstream adoption in many applications. However, they have been shown to be vulnerable to adversarial example attacks, which generate malicious output by adding slight perturbations to the input. Previous adversarial example crafting methods, however, use simple metrics to evaluate the distances between the original examples and the adversarial ones, so the resulting perturbations can be easily detected by human eyes. In addition, these attacks are often not robust due to the inevitable noise and deviations in the physical world. In this work, we present a new adversarial example crafting method, which takes the human perceptual system into consideration and maximizes the noise tolerance of the crafted adversarial example. Experimental results demonstrate the efficacy of the proposed technique.

artificial intelligence, machine learning, pixel, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
